Automated selection of nanoparticle models for small-angle X-ray scattering data analysis using machine learning

Many models have been developed for analyzing SAXS data; however, choosing the optimal model is difficult and time-consuming, especially for non-expert users. This paper proposes an algorithm, based on machine learning, representation learning and SAXS-specific preprocessing methods, that instantly selects the nanoparticle model best suited to describe SAXS data.


A Data generation

A.1 Simulation parameters
This section contains all information about the data set used in this study. The distribution of form factors in the database is balanced, with 4,184 I(q) curves simulated per form factor, which makes the results easier to interpret. As a consequence, the sampling density of the parameter space varies with the number of parameters of each form factor, but this choice retains a significant number of simulations for form factors with few parameters. For the 9 form factors used, the following list details how the parameters were chosen. For each simulated curve, the variable parameters are drawn from a uniform distribution. For some parameters, restrictions are added. Where mentioned, the parameter is polydisperse. The polydispersity function is a Gaussian with a full width at half maximum equal to a × param, where a is drawn from a uniform distribution on [0, 0.3].
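The polydispersity rule described above can be illustrated with a minimal sketch. It assumes the standard Gaussian relation FWHM = 2·sqrt(2·ln 2)·σ. The function names and the radius range are hypothetical and chosen only for illustration; they are not taken from the simulation code used in this study.

```python
import math
import random

# For a Gaussian, FWHM = 2 * sqrt(2 * ln 2) * sigma.
FWHM_TO_SIGMA = 1.0 / (2.0 * math.sqrt(2.0 * math.log(2.0)))


def polydispersity_sigma(param, rng, max_ratio=0.3):
    """Draw a ~ U[0, max_ratio] and return (a, sigma) with FWHM = a * param."""
    a = rng.uniform(0.0, max_ratio)
    fwhm = a * param
    return a, fwhm * FWHM_TO_SIGMA


def sample_polydisperse_values(param, sigma, n, rng):
    """Draw n values of the parameter from the Gaussian polydispersity function."""
    return [rng.gauss(param, sigma) for _ in range(n)]


rng = random.Random(0)
# Hypothetical variable parameter: a radius drawn uniformly (range is an assumption).
radius = rng.uniform(10.0, 100.0)
a, sigma = polydispersity_sigma(radius, rng)
radii = sample_polydisperse_values(radius, sigma, 5, rng)
print(f"radius={radius:.2f}, a={a:.3f}, sigma={sigma:.3f}")
```

With a at its upper bound of 0.3, the standard deviation is about 12.7 % of the parameter value, which gives a sense of the widest distributions present in the data set under this assumption.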

A.2 Curves examples' parameters
The parameters used to simulate the curves shown in Figure 1 are as follows:
• Core shell cylinder

C Preprocessing selection
Several combinations of preprocessing steps were tried for each representation space. Table 1 summarizes the main results obtained using the Franke space on DS syn xeuss, from which the cylinders and core shell cylinders have been removed.
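As an illustration of how one preprocessing choice can be scored, the sketch below computes a k-fold cross-validation accuracy. It is a generic sketch, not the pipeline used in this study: the classifier is a toy nearest-centroid rule on a single scalar feature, and all names (`cross_val_accuracy`, `fit_centroids`, `predict_centroids`, `preprocess`) and the toy data are hypothetical.

```python
import random
from statistics import mean


def k_fold_indices(n_samples, k, seed=0):
    """Shuffle sample indices and split them into k disjoint folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and the number of samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_val_accuracy(curves, labels, fit, predict,
                       preprocess=lambda c: c, k=5, seed=0):
    """Mean accuracy over k folds for one preprocessing function."""
    features = [preprocess(c) for c in curves]
    folds = k_fold_indices(len(features), k, seed)
    scores = []
    for i, test_idx in enumerate(folds):
        # Train on every fold except the i-th, test on the i-th.
        train_idx = [j for m, fold in enumerate(folds) if m != i for j in fold]
        model = fit([features[j] for j in train_idx],
                    [labels[j] for j in train_idx])
        predictions = predict(model, [features[j] for j in test_idx])
        hits = sum(p == labels[j] for p, j in zip(predictions, test_idx))
        scores.append(hits / len(test_idx))
    return mean(scores)


def fit_centroids(xs, ys):
    """Toy classifier (assumption): one centroid per class."""
    groups = {}
    for x, y in zip(xs, ys):
        groups.setdefault(y, []).append(x)
    return {label: mean(values) for label, values in groups.items()}


def predict_centroids(model, xs):
    """Assign each sample to the class with the nearest centroid."""
    return [min(model, key=lambda label: abs(model[label] - x)) for x in xs]


# Toy, well-separated data standing in for preprocessed curves.
toy_features = [0.0, 0.1, 0.2, 0.3, 0.4, 5.0, 5.1, 5.2, 5.3, 5.4]
toy_labels = ["sphere"] * 5 + ["prolate"] * 5
accuracy = cross_val_accuracy(toy_features, toy_labels,
                              fit_centroids, predict_centroids, k=5)
print(f"cross-validated accuracy: {accuracy:.2f}")
```

Comparing preprocessing combinations then amounts to calling `cross_val_accuracy` once per candidate `preprocess` function with the same folds (same `seed`), so that differences in accuracy are not caused by different splits.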

F Experimental data

F.1 Fits of experimental data
To better understand the predictions made on the experimental data, it is useful to evaluate the quality of the fits that can be obtained with the predicted form factors. We fitted each experimental curve recorded on the Xeuss, using the form factors most frequently predicted by the classification models trained on DS syn xeuss. The fits are shown in appendix F.2 and the resulting χ² values are as follows:
• Sphere n°1:
  - Fit sphere: χ² = 2.68
• Sphere n°2:
  - Fit sphere: χ² = 2.36
  - Fit prolate: χ² = 1.13
• Sphere n°3:
  - Fit sphere: χ² = 6.91
  - Fit prolate: χ² = 5.21
• Sphere n°4:
  - Fit sphere: χ² = 1.20
• Sphere n°5: a residual pattern from buffer subtraction appears at low q. Sphere and core shell sphere form factors were used to fit the whole curve, and another fit with the sphere form factor was performed without the beginning of the curve.

  - Fit prolate: χ² = 1.53
  - Fit oblate: χ² = 3.90
• Prolate n°2:
  - Fit prolate: χ² = 1.12
• Prolate n°3:
  - Fit prolate: χ² = 1.08

Figures 2, 3, 4, 5, 6, 7, 8, 9, 10 and 11 present, for each real sample, a Transmission Electron Microscopy (TEM) image, the corresponding SAXS curves in both device configurations, and the fits of the Xeuss1800HR SAXS curve with various form factors. Some experimental aspect ratios were measured from the TEM images. Particles in the core shell sphere n°1 sample have an average aspect ratio of 1.16 between equatorial radius and polar radius, which places them between our definitions of core shell sphere and core shell prolate; we decided to label them as core shell sphere. In samples sphere n°2 and sphere n°5 the average aspect ratio is 1.10, and these samples are therefore labelled as sphere. For sample sphere n°1, the average aspect ratio is 1.01, and it is likewise labelled as sphere.
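The χ² values listed in F.1 are goodness-of-fit scores. The exact definition used for them is not stated here; the sketch below assumes the common reduced chi-square, i.e. the sum of squared, uncertainty-normalized residuals divided by the number of degrees of freedom. The function name and the numbers are hypothetical.

```python
def reduced_chi_square(i_exp, i_fit, sigma, n_params):
    """Reduced chi-square between measured and fitted intensities.

    i_exp    : measured intensities I(q)
    i_fit    : intensities of the fitted model at the same q values
    sigma    : experimental uncertainties on i_exp (all non-zero)
    n_params : number of free parameters of the fitted form factor
    """
    if not (len(i_exp) == len(i_fit) == len(sigma)):
        raise ValueError("all inputs must have the same length")
    dof = len(i_exp) - n_params  # degrees of freedom
    if dof <= 0:
        raise ValueError("need more data points than fitted parameters")
    chi2 = sum(((e - f) / s) ** 2 for e, f, s in zip(i_exp, i_fit, sigma))
    return chi2 / dof


# Toy numbers (assumption), not taken from the experimental curves.
i_exp = [10.0, 8.0, 5.0, 2.0]
i_fit = [9.0, 8.5, 5.0, 2.5]
sigma = [1.0, 0.5, 0.5, 0.5]
print(reduced_chi_square(i_exp, i_fit, sigma, n_params=1))
```

Under this definition, a value close to 1 indicates residuals comparable to the experimental uncertainties, which is why a prolate fit with a lower χ² than the sphere fit for the same sample suggests a slightly anisotropic particle.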

Figure 1: Example of noiseless I(q) curves generated using the 9 form factors, all particle sizes having the same order of magnitude and all particles having the same scattering length density.

Figure 2: TEM imaging, SAXS curve recorded on Xenocs devices and fit of the Xeuss1800HR curve for sample sphere n°1. Panels: (a) TEM image of sphere n°1; (b) SAXS curves of sphere n°1.

Figure 3: TEM imaging, SAXS curve recorded on Xenocs devices and fits of the Xeuss1800HR curve for sample sphere n°2

Figure 4: TEM imaging, SAXS curve recorded on Xenocs devices and fits of the Xeuss1800HR curve for sphere n°3

Figure 5: TEM imaging, SAXS curve recorded on Xenocs devices and fit of the Xeuss1800HR curve for sphere n°4

Figure 7: TEM imaging, SAXS curve recorded on Xenocs devices and fits of the Xeuss1800HR curve for sphere n°6

Table 1: Accuracy computed by cross-validation on the data set from which cylinders and core shell cylinders have been removed

Table 2: Results from predictors and fits and quality of experimental data